In Search of the Components of Task
Induced Judgment Decrements
Betty  S.  Goldsberry 

Abstract 

Hammond's Cognitive Continuum Theory (Hammond, 1980) posits that three
major categories of task features (content, structure, and presentation)
determine how "analytically" or "intuitively" the individual will process
information in arriving at a judgment. The state of the art does not permit a
direct test of this basic assumption. However, some of its implications can be
examined by manipulating the task features systematically and observing the
effects upon both judgmental products and processes. This was the main purpose
of the two experiments presented in this report.

The  task  format  used  in  these  experiments  involved  judging  the  suitability  of 
hypothetical  job  applicants  for  various  positions.  An  optimal  model  was 
available for integrating the predictive information so that actual selection
outcomes  (success,  failure)  could  be  simulated.  Comparison  of  outcomes  derived 
from  human  judgment  with  those  derived  from  the  optimal  model  provided  an  index 
of  "product"  quality;  policy  capturing  techniques  provided  an  index  for 
"process"  evaluation. 

One content variable, the quantity of predictive information to be
integrated, and one structural variable, the availability of an explicit
integrative strategy, were combined factorially in a mixed design in Experiment
1. The structural variable, availability, was manipulated more precisely (four
levels of precision) together with a presentation variable, time permitted for
judgment, in a similar mixed design in Experiment 2. Key results indicated
that (a) human judgments were more accurate, relative to a theoretical optimum,
when fewer than four items were to be integrated; (b) the decline in accuracy
with quantity was reflected in process measures; (c) availability of an ideal
weighting strategy did little to ameliorate the quantity effect, since people did
as well with a simple ordinal approach; (d) accuracy declined as a function of
time pressure; (e) the four strategies reduced to two in actual use by the
subjects, an equal-weighting and an ordinal-weighting policy; and (f) the
superiority of the ordinal policy declined with time pressure. These results
support various predictions from the Cognitive Continuum Theory and identify,
for further exploration, a range of task conditions at the transition from
primarily "intuition-inducing" to primarily "analysis-inducing."


INTRODUCTION 


A common, critical, and rather intensively studied aspect of human
decision making is diagnosis (or inference) based on equivocal evidence (Payne,
1982). How (and how well) people carry out this function is a central issue
in the design of decision systems, ranging from personnel selection (Roose &
Doherty, 1976) to medical diagnosis (Nystedt & Magnusson, 1975) to a host of
military applications (Brady & Rappoport, 1973). If, as has been shown
repeatedly, man's "intuitive" capability for using evidence is limited, it
stands to reason that some form of "aiding" might help him produce judgments of
consistently higher quality. And, indeed, a variety of aids such as
"bootstrapping," Bayesian aggregation programs, and the like have already
appeared in operational systems (Slovic & Lichtenstein, 1971).

From a practical standpoint, of course, decision aiding is not without
drawbacks. In some situations, for example, it is simply infeasible (e.g., the
exigencies of battle often make timely access to a machine capability
impossible). In others, it may be unnecessary (e.g., if the improvement in
decision quality is not enough to justify the cost). And, even when it could
be of value, users often mistrust machine aiding, particularly if the
resulting diagnosis is inconsistent with their own intuition. Add to these
problems the growing suspicion that the case against human diagnostic
capability may have been overstated (see, for example, Einhorn & Hogarth,
1981), and it becomes clear that the whole concept requires some rethinking.
Aiding can obviously be of significant practical value; the question is, under
what circumstances?

Several recent trends in decision research bear upon this issue. One is
the redirection of attention from human deficiencies to task influences.
People are neither purely "heuristic" processors of diagnostic evidence nor
optimal rule-followers. How they function depends to a great extent upon task
features and, perhaps, upon individual differences as well (Payne, 1980).
Therefore, it becomes important to consider how man's cognitive approach varies
with identifiable task dimensions, and several current lines of research
(including the present one) have been addressing this question.

Another trend, which is just now in its formative stages, is the blending
of normative and descriptive approaches in the study of inference. For years
the predominant emphasis was normative and axiomatic: exploring human
performance in terms of optimal models. Then it shifted to the descriptive and
empirical: cataloging the various forms of "heuristic" processing and their
attendant biases (Slovic et al., 1977). Each emphasis carried with it a
different set of methods and issues. However, if one accepts the premise that
situational features can dictate how the unaided human will approach a decision
task and recognizes the obvious fact that some strategies will produce more
favorable results than others, it becomes clear that description and
prescription should proceed together. For example, it is useful to know
that decision makers (DM's) attach certain weights to specific items of
evidence in judging the qualifications of job candidates or in estimating the
likelihood of enemy attack. However, these subjective weights take on far more
practical significance when viewed against a theoretical optimum (i.e., the
weights that produce the best prediction). Only then can we establish the
practical importance of whatever "human intuition" adds to or subtracts
from the decision process. Only then can we begin to define the circumstances
under which "biases" become sufficient to warrant decision aiding; but only if
we have the descriptive data can we speculate on how best to implement aiding.

The present research was carried out in accordance with both of the above
trends. Its purpose was to identify ranges along several generic task
dimensions within which the cognitive strategies (or decision processes) used
by DM's appear to shift and, at the same time, to establish the importance of
such shifts for the decision product (i.e., the accuracy of decisions).

The general framework upon which both task manipulations and expected cognitive
influences were based was the Cognitive Continuum Theory (Hammond, 1980). In
this view, DM approaches decision tasks intuitively (as in heuristic
processing), analytically (as in rule-based processing), or at some
intermediate point on a continuum joining these two extremes (i.e., "bounded
rationality"). Where on this cognitive continuum he tends to operate depends
on several principal dimensions of the task: (a) ambiguity of task content, (b)
complexity of structure, and (c) form of task presentation. Highly complex,
ambiguous, and time-constrained tasks, for example, promote intuitive
processing, particularly if DM is not armed with the knowledge of optimal
strategies or principles of task organization. On the other hand, simpler
tasks (ones in which organization is clear, strategies are defined, and time
pressure is minimal) tend to encourage the analytic mode. The specific
task features that are presumed to define these three dimensions are reproduced
in Table 1.

Since  there  is  no  convenient  way  to  determine  a  priori  exactly  where  on 
the  continuum  construct  any  set  of  task  conditions  should  lie,  it  is  difficult 
to  submit  the  theory  to  rigorous  test.  Nonetheless,  as  a  means  of  organizing 
the  existing  facts  on  human  judgment  and  of  suggesting  promising  research 
directions,  the  Cognitive  Continuum  Theory  appears  to  have  considerable  merit. 
In  particular,  it  suggests  that  manipulation  of  the  decision  task  along 
specified  dimensions  (singly  or  in  combination)  should  produce  systematic 
changes  in  the  judgment  process  and,  as  a  result,  changes  in  the  decision 
product as well. Rarely in past research on judgment or decision making has
systematic change of this sort been examined within the context of the same
basic task scenario. More typically, specific issues have been addressed using
custom-made laboratory tasks: ones designed to focus on judgmental
heuristics, deviations from Bayes' rule, nonoptimal choice, logical problem
solving, and so on. By contrast, the Cognitive Continuum Theory encourages
exploration  of  dimensions  that  cut  across  such  task  domains:  from  clearly 
defined  to  ill  defined  situations;  from  simple  to  complex  structures;  from 
time-stressed  to  more  relaxed  settings.  Thus,  it  offers  a  coherent  way  of 
approaching  the  question  raised  at  the  outset:  When  do  conditions  warrant  some 
form  of  aiding  (and  what  judgmental  processes  are  the  most  likely  candidates 
for  help)? 

The present research involved manipulation of task variables from each of
the major continuum-theory categories (see Table 1) and measurement of both the
judgmental products of those manipulations and the processes by which the
judgments were reached. The object was to determine how the accuracy and
nature of a common type of judgment (cue-based prediction) change as conditions
shift from a more "intuitive" to a more "analytic" position on the task
continuum. At what point does unaided performance become seriously impaired,
and when that happens, can the impairment be attributed to any particular
aspect of the judgment process?

Two experiments were conducted, both using the same basic prediction task
but differing in the particular combination of features manipulated. In the
first, the quantity of predictive information (number of cues) and the
availability of an explicit organizing strategy (scheme for weighting the
cues) were varied; in the second, the complexity of the explicit strategy
and the time available for using it were investigated. In both studies,
performance was evaluated on the basis of how closely judgments approximated an
optimal organizing strategy. One measure, hit rate, was simply the
percentage of occasions on which DM's prediction was the same as that produced
by the optimal strategy. Another, achievement, was defined as the
correlation between DM's judgments and those produced by the optimal strategy.

To examine the judgmental processes responsible for hit rate and
achievement scores, it was necessary to compute three additional correlations.
Although these require a bit of explanation, which is properly reserved for the
next sections, suffice it to say here that they were indices of DM's
reliability in judgment, his demonstrated understanding or knowledge of
the proper organizing rule, and his demonstrated ability to put this knowledge
to use (or to control his judgments in a manner consistent with the rule).
In other words, task conditions were manipulated so as to place increasing
demands on the unaided human DM, demands which, at some point, would produce
degraded performance (hit rate and achievement). By tracking
reliability, knowledge, and control as well as performance over these
same conditions, it was hoped that the processes responsible for the decline
could be estimated.

METHOD 

Basic Task. The judgment task was chosen on the basis of its common
usage in "policy capturing" research, and the likelihood that it would be
meaningful for a wide variety of potential subjects. It consisted of
evaluating the suitability of hypothetical applicants for a secretarial
position using profiles of scores on nine-point skill ratings as predictors (or
"cues"). In particular, subjects were required to integrate the profile
information into a single suitability rating for each applicant by
marking a nine-point graphic rating scale.


The  target  secretarial  position  was  described  in  detail  in  a  written  job 
description. The profiles were constructed by random selection of ratings from
normal distributions along each of the orthogonal skill dimensions. Both the
normality  and  independence  features  were  explained  to  the  subjects  (so  that 
they  would  not  be  misled  into  searching  for  nonexistent  profile  structures). 
Although  the  information  available  for  making  suitability  judgments  varied  with 
experimental  conditions,  subjects  were  required  to  do  all  calculations  or  other 
operations  on  the  presented  data  mentally. 
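
As a rough illustration of the profile construction described above, the following Python sketch draws independent ratings for each skill dimension from a normal distribution and rounds them onto the nine-point scale. The mean of 5, the standard deviation of 2, the seed, and the function name `make_profiles` are assumptions made purely for illustration; the report does not state the distribution parameters. Note also that independent sampling yields cues that are only approximately orthogonal in any finite sample.

```python
import numpy as np


def make_profiles(n_profiles, n_cues, mean=5.0, sd=2.0, seed=0):
    """Draw hypothetical applicant profiles on a nine-point rating scale.

    Each column is one skill dimension (cue); each row is one applicant.
    The mean, sd, and seed are illustrative assumptions, not values
    taken from the report.
    """
    rng = np.random.default_rng(seed)
    raw = rng.normal(mean, sd, size=(n_profiles, n_cues))
    # Round to whole scale points and keep every rating within 1..9.
    return np.clip(np.rint(raw), 1, 9).astype(int)


# 80 experimental profiles on four cues, as in one session of the task.
profiles = make_profiles(n_profiles=80, n_cues=4)
```

Rounding and clipping distort the tails of the distribution slightly, so this should be read as one plausible way of producing such profiles rather than as the procedure actually used.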

The  key  task  manipulations  were  implemented  by  varying  the  number  of  skill 
ratings  in  each  profile  (the  quantity  variable),  the  time  allowed  for 
each  evaluation,  the  presence  or  absence  in  the  instructions  of  a  specific  rule 
for  combining  the  skill  ratings  (the  availability  variable),  and  the 
complexity of that integrating rule. Quantity and availability
were varied in the first experiment; time and complexity in the second.

Procedure and Measures. Each subject served for three 45-minute
sessions scheduled approximately one week apart. At the beginning of each
session, written instructions were presented describing the secretarial
position, the employer company, the assessment procedure, and whatever strategy
information was called for by the experimental condition. To ensure full
understanding, these instructions were augmented as necessary through verbal
exchange. Following the instructions, a booklet was presented containing the
90 applicant profiles to be rated during that session (10 practice and 80
experimental profiles arranged one to a page), and the subject simply worked
his or her way through the pages at a controlled pace, rating the suitability
of each profile in order. After all judgments were completed, a questionnaire
was administered probing the manner in which the subject perceived and
approached the task.


As indicated earlier, profiles were constructed by orthogonal combination
of ratings drawn from three, four, or five skill (cue) dimensions. These
dimensions are identified in Table 2 together with an indication of which ones
were used in the various conditions of the two experiments. In Exp. 1, number
of cues was an independent variable; hence, only those with an assigned weight
were included in each specified condition (e.g., typing skill,
language proficiency, and clerical skill for the three-cue
condition). In Exp. 2, the four dimensions designated by asterisks were used
throughout.

In both studies, the subject's overall suitability ratings for the
various profiles served as the basis for two sets of derived measures: product
measures (hit rate and achievement), which indicated how closely performance
approximated an optimum weighting strategy; and process measures
(reliability, knowledge, and control), which examined the cognitive
elements underlying that performance. Computation of product measures requires
definition of an optimal model relating cues (skill dimensions) to criteria.
In other words, an "ecological validity" relationship must exist before one can
study the human's proficiency in dealing with it. The model adopted in the
present case was the standard linear regression approach commonly used in
multiple-cue probability learning research. The weights assigned to the
various cues in Exp. 1, for example, are shown in Table 2; those used in Exp. 2
are described in Table 4. Hit rate, then, was simply the proportion of
a subject's suitability judgments that fell within a specified tolerance
interval around the value generated by the optimal weighting rule.
Achievement was the correlation between the subject's and the model's
ratings.
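
The two product measures can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function names, the example ratings, and the tolerance of 0.5 scale units are hypothetical and are not taken from the report, which expresses its tolerance as a distance on the graphic rating scale.

```python
import numpy as np


def optimal_ratings(cues, weights):
    """Ratings generated by a linear weighting rule (one row per profile)."""
    return np.asarray(cues, dtype=float) @ np.asarray(weights, dtype=float)


def hit_rate(judgments, optimal, tolerance):
    """Proportion of judgments within +/- tolerance of the optimal rating."""
    diffs = np.abs(np.asarray(judgments, dtype=float) - np.asarray(optimal, dtype=float))
    return float(np.mean(diffs <= tolerance))


def achievement(judgments, optimal):
    """Pearson correlation between the subject's and the model's ratings."""
    return float(np.corrcoef(judgments, optimal)[0, 1])


# Hypothetical data: four profiles, ratings already on a common scale.
example_judgments = np.array([5.0, 6.0, 7.0, 3.0])
example_optimal = np.array([5.0, 6.5, 7.0, 4.0])
example_hr = hit_rate(example_judgments, example_optimal, tolerance=0.5)
example_ach = achievement(example_judgments, example_optimal)
```

The two indices can diverge: a subject whose ratings are consistently offset from the optimum by a constant would show high achievement but a low hit rate, since correlation ignores differences in level.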


The process measures require a bit more explanation. Reliability
refers simply to the subject's consistency in judgment and was indexed by the
correlation between ratings made for the same sets of profiles early and late
in each session. Knowledge, or the subject's understanding of the optimal
rule, was indexed by the correlation between the optimal ratings and those
produced by a model of the subject's "policy." The latter, of course, was
derived from the subject's actual rating behavior through the use of linear
regression to "capture" his (implicit) weighting rule (Dawes & Corrigan, 1974;
Goldberg, 1974). Control was indexed by the correlation between judgments
predicted on the basis of this "captured policy" and those actually produced
by the subject in his profile ratings.
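
The process measures described above can be sketched as follows. This is an illustrative reconstruction, not the author's analysis code: the simulated subject, the weights, the noise level, and all function names are assumptions, and ordinary least squares with an intercept stands in for whatever regression routine was actually used.

```python
import numpy as np


def corr(x, y):
    """Pearson correlation between two rating vectors."""
    return float(np.corrcoef(x, y)[0, 1])


def capture_policy(cues, judgments):
    """Fit a linear model of the subject's ratings ("policy capturing").

    Returns the captured cue weights and the ratings the model predicts.
    """
    design = np.column_stack([np.ones(len(cues)), cues])  # intercept + cues
    coef, *_ = np.linalg.lstsq(design, judgments, rcond=None)
    return coef[1:], design @ coef


def process_indices(cues, judgments, optimal, early, late):
    """Reliability, knowledge, and control, plus achievement for reference."""
    _, predicted = capture_policy(cues, judgments)
    return {
        "reliability": corr(early, late),        # same profiles, rated twice
        "knowledge": corr(optimal, predicted),   # optimal vs. captured policy
        "control": corr(predicted, judgments),   # captured policy vs. actual
        "achievement": corr(optimal, judgments),
    }


# Simulated session with hypothetical weights and rating noise.
rng = np.random.default_rng(0)
cues = rng.integers(1, 10, size=(80, 4)).astype(float)  # ratings 1..9
optimal_w = np.array([0.4, 0.3, 0.2, 0.1])  # assumed optimal weights
subject_w = np.array([0.3, 0.3, 0.2, 0.2])  # assumed subject policy


def simulate_ratings(profile_cues):
    """Noisy application of the assumed subject policy."""
    return profile_cues @ subject_w + rng.normal(0.0, 0.5, size=len(profile_cues))


judgments = simulate_ratings(cues)
early, late = simulate_ratings(cues[:10]), simulate_ratings(cues[:10])
indices = process_indices(cues, judgments, cues @ optimal_w, early, late)
```

When the optimal ratings are an exact linear function of the cues, as here, the product of the knowledge and control indices reproduces achievement, which is the simplified lens-model relationship used in the report.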

The distinction among these measures and the logic on which they are
founded are perhaps best illustrated with reference to their original source:
the Brunswik lens model (Brunswik, 1956; Hammond & Summers, 1972). As shown in
Figure 1, Tucker (1964) suggested that the relationship (r_a) between judgments
(Y_s) and criterion measures (Y_e, or the actual outcome of the events judged)
can be partitioned into several statistically independent components reflecting
respectively: (a) the subject's acquired knowledge of specific task properties,
(b) his cognitive control in applying this knowledge, and (c) the degree of
predictability in the task system. Tucker's complete equation reads as follows:

    r_a = G R_e R_s + C \sqrt{1 - R_e^2} \sqrt{1 - R_s^2}        (1)

where

r_a = the correlation between correct (Y_e) and observed (Y_s) judgments;

G = the correlation between the linear prediction of the correct
judgment (\hat{Y}_e) and the linear prediction of the subject's judgment
(\hat{Y}_s) from the cue values;

R_s = the multiple correlation between the cues and the subject's
judgments (Y_s), which is also the correlation between the subject's
judgments (Y_s) and the linear predictions of the subject's judgments
(\hat{Y}_s);

R_e = the predictability of the criterion (Y_e) from the cues in the task
(which is also the correlation between the actual correct judgments
(Y_e) and the predicted correct (\hat{Y}_e) judgments);

C = the correlation between the variance in the task system and the
subject's judgmental system which is unaccounted for by G.

In the present case, this equation can be simplified to:

    r_a = G R_s

since specification of an optimal organizing (weighting) strategy makes the
criterion perfectly predictable from the cue values (i.e., R_e = 1.00), and the
optimal strategy is linear (eliminating the need for the right-hand term of
equation 1). Thus simplified, Tucker's equation renders the distinctions
among measures used in the present experiments apparent. The r_a term
represents achievement which, as we saw, can be decomposed into the two
"process" indices: knowledge (G) and control (R_s). In words, a DM's
judgments are accurate (achievement) to the extent that they correspond to
the "true suitability" of the hypothetical secretarial applicants as dictated
by our optimal model (e.g., Table 2). He can only be accurate if he
understands this rule (knowledge) and is able to apply it (control)
with consistency (reliability). By having indices of these three
processes, we are in a position to probe the influence of the task conditions
suggested by the Cognitive Continuum Theory.
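
The simplification can be written out step by step. The block below assumes the standard form of Tucker's lens model equation, with symbols as defined in the text; the numerical values in the note that follows it are hypothetical and serve only to show the arithmetic.

```latex
\begin{align*}
r_a &= G R_e R_s + C \sqrt{1 - R_e^2}\,\sqrt{1 - R_s^2} \\
    &= G \cdot 1 \cdot R_s + C \sqrt{1 - 1^2}\,\sqrt{1 - R_s^2}
       && \text{(setting } R_e = 1\text{)} \\
    &= G R_s .
\end{align*}
```

With R_e = 1 the second term vanishes whatever the value of C. As a purely hypothetical example, a DM with knowledge G = .90 and control R_s = .80 would show achievement r_a = .90 x .80 = .72.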


EXPERIMENT  1 


As  indicated  earlier,  a  task  content  variable,  cue  quantity,  and  a 
structure  variable,  rule  availability,  were  manipulated  in  factorial 
combination.  The  purpose  was  to  induce  systematic  shifts  in  performance  which 
could  then  be  analyzed  using  the  process  measures. 

Subjects and design. Twenty-four undergraduate psychology students
participated in exchange for extra course credit. Each was assigned at random
to one of two groups defined on the basis of rule availability. The
weighting group was provided with the specific numerical strategy used to
generate the "optimum" ratings: the regression weights shown in Table 2 plus a
description of how such weights should be used in making judgments (the linear
weighting rule). The order group was told only that the cues were ordered
in importance for the job of secretary as indicated in parentheses in Table 2.
The quantity variable was manipulated within subjects at three levels: 3,
4, or 5 cues as shown in Table 2. Thus the design was a mixed model 2 x 3
factorial with 12 subjects per group.

Results. The principal findings for all measures are summarized in
Table 3. Hit rate scores are based on a definition of accuracy in terms of
a .2 cm tolerance interval around the optimal point on the graphic rating
scale. All the other measures are in terms of correlations as defined earlier.

Considering first the overall (product) measures, both show the expected
decline in performance as a function of the quantity of information to be
processed (number of cues). However, the availability of an optimal strategy
for weighting the cues (Group I vs. Group II) produced no apparent improvement
over the simple ordinal instructions. These obvious trends were supported by
an analysis of variance; quantity was highly significant for both hit rate
and achievement, F(2,44) = 18.69 and 47.20 respectively, both p < .001;
neither availability nor its interaction with quantity approached
significance on either index, F(1,22) was less than .3$ in all cases.

Since the same pattern of significance occurred on the process measures as
well, all were collapsed across groups as shown at the bottom of Table 3; these
results are presented graphically in Figure 2.

One important difference between the two product performance measures is
the shape of the decrement over information quantity (see Figure 2).
Achievement (Ach) scores declined linearly over all three levels while
hit rate (HR) dropped only between the 3-cue and 4-cue levels. This
pattern was supported by the Newman-Keuls test applied to pairs of conditions:
the 3-4 cue difference was reliable for both measures (p < .01); the 4-5 cue
difference was reliable for Ach (p = .01) but not for HR.

Turning to the process measures, it is apparent that knowledge (G) and
control (C) were, in fact, the principal components in the overall Ach score. A
multiple regression analysis showed that G explained 35% of the variance in
Ach, C accounted for an additional 48%, and reliability (R) added less than 1%
(a nonsignificant increment). The G component also contributed significantly
to the overall hit rate (HR) score, accounting for 31% of the variance; the
contribution of C, however, dropped to 7%, and that of R was again
nonsignificant (at .4%). Looking at Fig. 2, it is easy to see how well G x C (the
product of G and C) predicts the Ach function. One cannot, of course, "read"
directly the relative contributions made by the components to the explained HR
or Ach variance.

The data for each process measure were analyzed separately in the same
fashion as for the product scores. Again, the ANOVA results showed a
significant quantity effect for all measures: F(2,44) = 13.01 for C,
37.66 for G, and 6.67 for R; p < .001, .001, and .003 respectively. Neither
availability nor its interaction with information quantity was significant
on any index: F(1,22) = 1.31, p < .263 for G; F(1,22) = 1.20, p < .286
for C; F(1,22) = 1.37, p < .457 for R. The Newman-Keuls analyses suggested
that the 3-4 cue differences were reliable for all three indices (p < .01), but
the 4-5 cue difference was limited to the G component (p < .05).

Despite the fact that R was also affected significantly by the quantity
manipulation, the effect was not as systematic as in the case of the other
process measures: for Group I, the 4-cue condition was inferior to the others;
for Group II, the 5-cue condition was uniquely inferior. Since, as we have
seen, this component accounted independently for very little variance in either
product measure, it was not considered worthy of further interpretation. It is
not surprising, of course, that R should add little unique predictiveness; in
essence, both it and C should index DM's ability to apply his own weighting
rule consistently.

Briefly, then, it appears that DM's ability to make integrative
judgments declines as the number of cues increases, largely because he is less
proficient at formulating an effective weighting policy (knowledge). Lack of a
good strategy, of course, precludes maximum achievement. In addition, Ach
appears to suffer from DM's inability to apply his own weighting strategy
consistently (control). As one might expect, HR is not as predictable from
cognitive process measures as is Ach; in fact the simple correlations between
the two product measures only averaged r = .48. However, what predictability
there is for HR rests almost entirely with the G component.

The fact that availability of an optimal weighting strategy did little to
offset the decline in performance with information load is not surprising; if
the detailed strategic information merely adds to an already heavy processing
load, one would scarcely expect it to be of much help. At the lower
information levels, however, one would expect some benefit. Absence of an
availability effect here could reflect either a ceiling limitation (an
achievement score of around .90 may leave little room for improvement) or
possibly a tendency for DM to reduce the metric weighting scheme to a simpler
ordinal one (in which case the strategy would be identical to that for the
"unavailable" condition). In view of this ambiguity, the rule availability
manipulation was repeated, and expanded, in Exp. 2.

EXPERIMENT  2 

In  this  study  a  task  presentation  variable  (response  time  limitation) 
was  examined  in  conjunction  with  a  strategy  manipulation.  Here,  however,  a 
wider  range  of  available  weighting  rules  was  used:  the  simplest  (Note  1) 
consisted  of  four  equal  weights;  the  most  complex  (4),  of  four 
different  weights.  One  rule  (3)  was  identical  to  the  4-cue  condition  of 
Exp.  1.  The  expectation  was  that  the  more  complex  weighting  schemes  would 
encourage  analytic  processing,  particularly  under  a  liberal  time  limitation. 
However,  with  increasing  time  pressure,  DM  would  be  forced  into  a  more 
intuitive  mode  and  overall  performance  (hit  rate,  achievement)  should  suffer. 
As  in  Exp.  1,  the  plan  was  to  examine  the  elements  of  any  such  overall  decline 
using  knowledge  (G),  control  (C),  and  reliability  (R)  measures. 

Subjects  and  Design.  The  48  undergraduate  volunteers  were  assigned 
randomly  to  four  groups,  each  of  which  served  under  one  of  the  weighting  rules 
shown  in  Table  4.  Each  subject  judged  the  same  80  applicant  profiles  under 
three  different  time  limits  (5,  10,  and  15  seconds)  with  order  of  treatment 
balanced  over  subjects  within  each  group.  Order  of  profiles  within  each 
condition  was  randomized.  Thus,  the  design  was  identical  to  that  used  in  Exp. 
1 except that the between-subjects variable (weighting rule) was manipulated
over  four  rather  than  two  levels. 

Results. The principal findings are summarized in Table 5. For the
overall (product) measures, performance declined with both time
limitation and weighting strategy complexity (if equal weighting of
cues is considered the simplest rule). Both of these main effects were highly
significant. In the case of the weighting strategy variable,
F(3,43) = 9.68, p < .001 for HR, and 6.23, p < .001 for Ach; for time
limitation, the respective values were F(2,86) = 11.98 and 13.40, both
p < .001. Despite the fact that the time limitation function appears steeper
under the two simpler conditions (Groups I and II) than under the more complex
ones (Groups III and IV), the interaction did not approach significance on
either measure: F(6,86) = 1.13, p < .34 for HR; 1.43, p < .20 for Ach.
These functions are illustrated in Figure 3 (HR) and Figure 4 (Ach) for the
various strategy groups, and collapsed across groups in Figure 5.

Looking at Figures 3 and 4, it is apparent that the critical complexity
level is the point at which either more or fewer than half the cues are to be
weighted equally: the 3 and 4 cue levels produced similar performance that was
generally superior to the 0 and 2 cue levels on both HR and Ach. Figure 3
shows that the time limitation was considerably more detrimental to HR than to
Ach, particularly when fewer than 10 seconds were permitted. Overall, the
correlation between HR and Ach scores was somewhat higher (r = .62) than in Exp.
1 (r = .48).

The G and C components again accounted for most of the variance in Ach
(93%) and a significant, though considerably smaller, portion of that for HR
(43%), while R added a nonsignificant 2% and 3%, respectively. Figure 4 shows
that GxC (the product of G and C) again yielded almost perfect prediction of
Ach in that the two lines on the figure coincide. While the pattern of
variance in the two product measures explained by these components was not
identical to that for Exp. 1, it was generally quite similar. That is, G
accounted for more of the HR variance, and C for more of the Ach variance
(although both C and G plus their interaction contributed heavily to Ach), and
these tendencies were greatest under the shortest time limitation (5 seconds).
Unlike Experiment 1, however, C was the clearly dominant factor in both HR and
Ach for the 15 second condition (it accounted for about five times the
variance that G did on both measures).
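
As a quick arithmetic check on how closely the product of G and C tracks achievement, the Group I means from Table 5 can be multiplied directly. This is only a sketch: the tabled GxC values were presumably averaged over subjects, so products of group means need not reproduce them exactly, and the variable and function names are illustrative.

```python
# Group I (four equal weights) means transcribed from Table 5.
table5_group1 = {
    "15 sec": {"ach": 0.94, "G": 0.98, "C": 0.96},
    "10 sec": {"ach": 0.89, "G": 0.98, "C": 0.91},
    "5 sec": {"ach": 0.83, "G": 0.97, "C": 0.85},
}


def predicted_achievement(g, c):
    """GxC index: predicted achievement as the product of G and C."""
    return g * c


gaps = {}
for condition, m in table5_group1.items():
    gxc = predicted_achievement(m["G"], m["C"])
    gaps[condition] = abs(gxc - m["ach"])  # distance from observed Ach
    print(f"{condition}: GxC = {gxc:.3f}, Ach = {m['ach']:.2f}")
```

For these three conditions the product of the means stays within about .01 of the observed achievement means.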

The separate ANOVAs applied to the process scores indicated that time
limitation had a significant effect on all three measures:
F(2,86) = 22.02 for C, 7.56 for G, and 8.99 for R; all p < .001.
Strategy complexity produced a reliable effect on two measures:
F(3,43) = 3.88, p < .015 for C; 14.37, p < .001 for G; but only 2.32, p < .088
for R. More noteworthy than the main effects, however, was a significant
interaction between the two variables for the C index, F(6,86) = 2.57, p <
.024. Looking at Table 5, it is clear that the interaction reflects an
exaggeration of the time limitation effect under the presumably simplest
equal-weighting condition. What this suggests is that the equal weighting rule
can be applied very effectively if there is sufficient time; if not, the C
index drops to a level below that for the "complex" unequal weighting rule.
Neither the G nor the R measures showed an interaction between time limitation
and strategy: F(6,86) = 1.38 and 1.48 respectively, p < .20.
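
The shape of the C interaction can be read directly off the Table 5 means. The short sketch below simply tabulates the drop in mean control from the 15 second to the 5 second condition for each group; the dictionary layout and names are illustrative, and the values are the tabled group means rather than subject-level data.

```python
# Mean control (C) scores transcribed from Table 5,
# keyed by group and time limit in seconds.
control = {
    "I": {15: 0.96, 10: 0.91, 5: 0.85},    # 4 equal weights
    "II": {15: 0.96, 10: 0.96, 5: 0.92},   # 3 equal weights
    "III": {15: 0.93, 10: 0.89, 5: 0.87},  # 2 equal weights
    "IV": {15: 0.90, 10: 0.93, 5: 0.87},   # 0 equal weights
}

# Loss in control when the limit is cut from 15 to 5 seconds.
drop = {group: round(c[15] - c[5], 2) for group, c in control.items()}
largest = max(drop, key=drop.get)

for group, d in drop.items():
    print(f"Group {group}: C falls by {d:.2f}")
print("Largest drop: Group", largest)
```

The equal-weighting group shows the largest loss, and its 5 second mean (.85) falls below that of the fully unequal rule (.87), which is the pattern described above.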

It is of some interest to compare the results for Group III under the 10
second condition (see Table 5) with those for Group II under the 4 cue
condition in Exp. 1 (see Table 3) since they were, for all practical purposes,
the same. The pattern across the various measures was quite similar; however,
the absolute level of performance was uniformly higher in the present study.
The only plausible explanation is that processing a fixed number of cues, even
with variable time constraints, is less confusing than having to integrate a
variable number of cues.

Comparing the principal ways of varying task difficulty in the two studies
(number of cues to be integrated vs. time in which to integrate them),
it would appear that number tends to have a greater effect on the G component,
and time, on the C component. That is, increasing the number of cues to be
processed makes it harder for DM to formulate an appropriate strategy; reducing
the time available makes it harder to apply his strategy consistently.

DISCUSSION 

The research plan was to manipulate task variables in a manner designed to
induce performance decrements and, by indexing concurrent changes in judgment
processes, to identify the major components of those decrements. It was hoped
that the information yielded by this approach would have implications for
"decision aiding" in one common type of judgment task.

As expected, both increasing the amount of information to be processed
(from 3 to 5 cues) and decreasing the available time (from 15 to 5 seconds per
judgment) reduced the overall accuracy of DM's judgments (HR and Ach). Making
the judgment rule more complex (unequal weights assigned to more than half the
cues) also significantly reduced accuracy. Making it less explicit
(ordering rather than weighting the importance of cues) had no effect,
possibly because rank ordering is as precise as the human judge can be in this
task. In any case, the two experiments provided ample opportunity for the study
of induced decrements in judgment accuracy.

Two principal aspects of judgment were of interest: how closely DM's
cue-weighting policy corresponded to the optimal rule (knowledge or G), and how
consistently he was able to apply his own policy in making judgments (control
or C). The latter concept was also indexed using a third measure (R), but
since it turned out to be largely redundant with the other two, it will not be
discussed further.

The main finding with respect to these "process" measures was that
different means of inducing decrements do, in fact, seem to operate through
somewhat different cognitive mechanisms. In particular, adding to the
information load (cues to be integrated) affects DM's ability to formulate an
appropriate weighting strategy, even though the necessary information is
provided explicitly. This is what one would expect if, in the language of
Hammond's Cognitive Continuum Theory, the judge were to rely on an
intuitive processing mode: "knowing" the proper rule does little to help
him make better intuitive judgments because he is forced to use a simpler
strategy. On the other hand, reducing the time available and/or increasing the
complexity of the weighting rule seems to have a greater impact on the
application than on the formulation of a proper weighting policy. Performance
breaks down because DM has difficulty carrying out his own preferred strategy
with any consistency. Over a large number of judgments, he may accord each cue
its proper weight, but in a particular instance, he simply has trouble
integrating the various cue values. Again, in Hammond's terminology, it is as
though he were operating in a proper analytic mode, but was unable to do
all the required mental calculations in the time allowed.
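
The two mechanisms can be made concrete with a small simulation. The sketch below is illustrative only: it assumes the conventional lens-model definitions (G as the correlation between the judge's captured linear policy and the optimal rule, C as the correlation between the judge's responses and his own captured policy), which this section does not spell out, and the cue values, weights, and noise levels are all hypothetical. One simulated judge simplifies the rule but applies it consistently; the other keeps the proper weights but applies them erratically.

```python
import random


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / (saa * sbb) ** 0.5


def ols_fit(X, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    rows = [[1.0] + list(x) for x in X]
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def weighted_sum(weights, cues):
    return sum(w * v for w, v in zip(weights, cues))


def lens_indices(X, judgments, optimal_weights):
    """Return (G, C, Ach) for one judge under a known linear optimal rule."""
    criterion = [weighted_sum(optimal_weights, x) for x in X]
    beta = ols_fit(X, judgments)  # policy capturing
    policy = [beta[0] + weighted_sum(beta[1:], x) for x in X]
    G = pearson(policy, criterion)        # knowledge
    C = pearson(judgments, policy)        # control
    ach = pearson(judgments, criterion)   # achievement
    return G, C, ach


rng = random.Random(0)
profiles = [[rng.randint(1, 9) for _ in range(5)] for _ in range(80)]
optimal = (0.30, 0.25, 0.20, 0.15, 0.10)  # hypothetical optimal rule

# (a) Information load: the judge attends to two cues only, but consistently.
simplified = [weighted_sum((0.5, 0.5, 0.0, 0.0, 0.0), p) + rng.gauss(0, 0.1)
              for p in profiles]
# (b) Time pressure: proper weights on average, erratic application.
rushed = [weighted_sum(optimal, p) + rng.gauss(0, 1.0) for p in profiles]

G_a, C_a, ach_a = lens_indices(profiles, simplified, optimal)
G_b, C_b, ach_b = lens_indices(profiles, rushed, optimal)
print(f"simplified: G = {G_a:.2f}, C = {C_a:.2f}, Ach = {ach_a:.2f}")
print(f"rushed:     G = {G_b:.2f}, C = {C_b:.2f}, Ach = {ach_b:.2f}")
```

Under these assumptions the simplifying judge loses mainly on G and the rushed judge mainly on C. Because the simulated criterion is an exact linear function of the cues, achievement here equals the product of G and C, which parallels the GxC index of footnote 2.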

The above account is, of course, an oversimplification of the
results: both G and C components were involved to an extent in most of the
induced decrements. Furthermore, for reasons that are not entirely clear,
performance was somewhat higher overall in Exp. 2 than in Exp. 1, even on
identical conditions; hence direct comparisons between studies must be viewed
cautiously. Nevertheless, the fact that different patterns of results emerged
from the two studies, and that they quite possibly represent different kinds of
processing deficits (albeit produced within the same basic task framework), is
an encouraging finding. Future experimentation should, among other things,
seek to replicate the time and quantity functions in factorial combination
within the same study.

Should the present findings hold up under replication, it would suggest
that system designers should focus on different aiding concepts depending upon
how a particular task "stresses" the DM. For example, if he must deal with a
large number of predictive items, some form of "divide and conquer" strategy
might be most appropriate (to minimize the loss from "intuitive"
simplification). If, on the other hand, time pressure is the main task demand,
"bootstrapping" might be preferable.

The idea of tailor-making aiding concepts to generic task forms
(diagnosis, action selection, etc.) is not, of course, new. What is suggested
here, however, is that different concepts might be implemented within the
same basic task scenario depending upon which features are most
troublesome. As the nature of the deficits produced by specific task
dimensions becomes more clearly established, the plausibility of this approach
to aiding can be subjected to more rigorous test.

Footnotes

1. Table 1 suggests that an equal weighting rule is simpler cognitively
than an unequal one, although evidence for this assumption is sparse.

2. The G x C index is a predicted achievement score based on the product
of obtained G and C scores.

Table 1


Complexity of Task Structure

Inducing Intuition                          Inducing Analysis

1. Texture of judgment scale                1. Texture of judgment scale
   A. Many alternatives                        A. Few alternatives
   B. Many steps to solution                   B. Few steps
2. Number of cues presented*                2. Number of cues presented
   A. Many cues (5 or more)                    A. Few cues (2-4)
   B. Contemporaneously displayed              B. Sequentially encountered
3. Vicarious mediation -                    3. Vicarious mediation -
   intra-ecological correlations               intra-ecological correlations
   large (R = .5) degree (horizontally)        minimal degree (vertically)
4. Cue distribution characteristics         4. Cue distribution characteristics
   A. Normal                                   A. Peaked
   B. Linear function                          B. Nonlinear, nonmonotonic function
5. Weights - equal**                        5. Weights - unequal
6. Organizing principle - linear            6. Organizing principle - nonlinear

Ambiguity of Task Content

Inducing Intuition                          Inducing Analysis

1. Organizing principle not available*      1. Organizing principle available
2. Task outcome not available               2. Task outcome available
3. Unfamiliar content                       3. Highly familiar content
4. Feedforward                              4. Feedforward
   A. No training                              A. Prior skill
   B. No information                           B. Information
5. Feedback - minimal                       5. Feedback - cognitive feedback

Form of Task Presentation

Inducing Intuition                          Inducing Analysis

1. Task decomposition - a posteriori        1. Task decomposition - a priori
2. Cognitive decomposition -                2. Cognitive decomposition -
   a posteriori                                a priori
3. Type of cue data - continuous            3. Type of cue data - dichotomous
4. Type of cue definition                   4. Type of cue definition
   A. Pictorial                                A. Pictorial
   B. Subject measures cue levels              B. Objective measures
5. Response time permitted or implied -     5. Response time permitted or implied -
   brief**                                     open

*  Manipulated in Experiment 1.
** Manipulated in Experiment 2.




Table 3

Mean Scores Obtained under the Six Experimental
Conditions in Experiment 1 on All Five Measures

                                          Product                       Process (r)
Experimental Condition            Hit Rate (%)   Ach. (r)   Knowledge (G)   Control (C)   (GxC)

Strategy Not Available (Group I)
  3 cues                               49          .93           .98            .95        .93
  4 cues                               38          .82           .89            .92        .82
  5 cues                               36          .75           .83            .92        .76
  collapsed over quantity              41          .83           .90            .93        .84

Strategy Available (Group II)
  3 cues                               59          .89           .97            .92        .89
  4 cues                               39          .79           .92            .85        .78
  5 cues                               39          .73           .85            .85        .72
  collapsed over quantity              46          .80           .91            .87        .80

Collapsed Over Groups
  3 cues                               54          .91           .97            .94        .91
  4 cues                               39          .80           .91            .89        .81
  5 cues                               38          .74           .84            .89        .75
  collapsed over quantity              44          .82           .91            .91        .82

Table 4

Weighting Strategies - Experiment 2

Skill Ratings            4 equal   3 equal   2 equal

Typing Speed               .25       .40       .50
Language Proficiency       .25       .20       .20
Telephone Usage            .25       .20       .20

Table 5


Mean Scores Obtained under the Twelve Experimental
Conditions in Experiment 2 on All Five Measures

                                          Product                       Process (r)
Experimental Condition            Hit Rate (%)   Ach (r)    Knowledge (G)   Control (C)   (GxC)

4 Equal Wgt. Strategy (Group I)
  15 sec.                              74          .94           .98            .96        .94
  10 sec.                              71          .89           .98            .91        .89
  5 sec.                               54          .83           .97            .85        .82
  collapsed over times                 66          .89           .98            .91        .88

3 Equal Wgt. Strategy (Group II)
  15 sec.                              78          .95           .99            .96        .95
  10 sec.                              70          .94           .98            .96        .94
  5 sec.                               56          .91           .99            .92        .91
  collapsed over times                 68          .93           .99            .95        .93

2 Equal Wgt. Strategy (Group III)
  15 sec.                              49          .91           .98            .93        .91
  10 sec.                              48          .86           .97            .89        .86
  5 sec.                               36          .87           .96            .87        .84
  collapsed over times                 44          .87           .97            .90        .87

0 Equal Wgt. Strategy (Group IV)
  15 sec.                              44          .88           .98            .90        .88
  10 sec.                              43          .88           .95            .93        .88
  5 sec.                               41          .84           .97            .87        .87
  collapsed over times                 43          .88           .97            .90        .88



Figure 1. Brunswik's lens model. (Diagram labels: ENVIRONMENT, SUBJECT.)

Figure 5. Time interval function for the main product and process measures
collapsed over groups (note 2).